Bioinformatics for RNA‐Seq Data Analysis
Abstract
Although RNA sequencing (RNA‐seq) has become increasingly popular for transcriptome profiling, analyzing the massive amount of data generated by large‐scale RNA‐seq remains a challenge. RNA‐seq data analysis typically consists of (1) accurate mapping of millions of short sequencing reads to a reference genome, including the identification of splicing events; (2) quantification of expression levels of genes, transcripts, and exons; (3) differential analysis of gene expression among different biological conditions; and (4) biological interpretation of differentially expressed genes. Although multiple algorithms for these basic analyses have been developed, a variety of questions remain unresolved. In this chapter, we review the main tools and algorithms currently available for RNA‐seq data analysis, with the goal of helping analysts make an informed choice of tools in practice. Because RNA‐seq is evolving rapidly, we also briefly introduce newer sequencing technologies, including stranded RNA‐seq, targeted RNA‐seq, and single‐cell RNA‐seq.
Related resources
Falco: a quick and flexible single-cell RNA-seq processing framework on the cloud
Summary: Single-cell RNA-seq (scRNA-seq) is increasingly used in a range of biomedical studies. Nonetheless, current RNA-seq analysis tools are not specifically designed to efficiently process scRNA-seq data due to their limited scalability. Here we introduce Falco, a cloud-based framework to enable parallelization of existing RNA-seq processing pipelines using big data technologies of Apache Ha...
powsimR: power analysis for bulk and single cell RNA-seq experiments
Summary: Power analysis is essential to optimize the design of RNA-seq experiments and to assess and compare the power to detect differentially expressed genes in RNA-seq data. PowsimR is a flexible tool to simulate and evaluate differential expression from bulk and especially single-cell RNA-seq data, making it suitable for a priori and posterior power analyses. Availability and implementation...
compcodeR - an R package for benchmarking differential expression methods for RNA-seq data
compcodeR is an R package for benchmarking of differential expression analysis methods, in particular, methods developed for analyzing RNA-seq data. The package provides functionality for simulating realistic RNA-seq count datasets, an interface to several of the most commonly used differential expression analysis methods and extensive functionality for evaluating and comparing diffe...
RNA-Rocket: an RNA-Seq analysis resource for infectious disease research
Motivation: RNA-Seq is a method for profiling transcription using high-throughput sequencing and is an important component of many research projects that wish to study transcript isoforms, condition-specific expression and transcriptional structure. The methods, tools and technologies used to perform RNA-Seq analysis continue to change, creating a bioinformatics challenge for researchers who wis...
Polyester: simulating RNA-seq datasets with differential transcript expression
Motivation: Statistical methods development for differential expression analysis of RNA sequencing (RNA-seq) requires software tools to assess accuracy and error rate control. Since true differential expression status is often unknown in experimental datasets, artificially constructed datasets must be utilized, either by generating costly spike-in experiments or by simulating RNA-seq data. RES...